Search results for "Methodology Article"

showing 10 items of 20 documents

OMICfpp: a fuzzy approach for paired RNA-Seq counts

2019

© The Author(s) 2019.

0106 biological sciences | lcsh:QH426-470 | Pipeline (computing) | lcsh:Biotechnology | RNA-Seq | Binomial test | Sample (statistics) | Biology | oncología médica | Medical Oncology | 01 natural sciences | Fuzzy logic | Set (abstract data type) | 03 medical and health sciences | User-Computer Interface | Software | lcsh:TP248.13-248.65 | Genetics | Humans | Càncer | 030304 developmental biology | Ordered weight average | 0303 health sciences | business.industry | Sequence Analysis RNA | Methodology Article | High-Throughput Nucleotide Sequencing | Pattern recognition | Colorectal cancer | lcsh:Genetics | 3201.01 Oncología | transcriptoma | Randomization distribution | RNA | Artificial intelligence | DNA microarray | business | Colorectal Neoplasms | Transcriptome | 010606 plant biology & botany | Biotechnology
researchProduct

CLOVE: classification of genomic fusions into structural variation events

2017

Background A precise understanding of structural variants (SVs) in DNA is important in the study of cancer and population diversity. Many methods have been designed to identify SVs from DNA sequencing data. However, the problem remains challenging because existing approaches suffer from low sensitivity, precision, and positional accuracy. Furthermore, many existing tools only identify breakpoints, and so do not collect related breakpoints and classify them as a particular type of SV. Due to the rapidly increasing usage of high throughput sequencing technologies in this area, there is an urgent need for algorithms that can accurately classify complex genomic rearrangements (involving more than …

0301 basic medicine | Genomics | Biology | computer.software_genre | lcsh:Computer applications to medicine. Medical informatics | Biochemistry | Chromosomes | DNA sequencing | Set (abstract data type) | Structural variation | User-Computer Interface | 03 medical and health sciences | Structural Biology | Escherichia coli | Humans | Copy-number variation | Molecular Biology | lcsh:QH301-705.5 | Internet | Methodology Article | Applied Mathematics | Breakpoint | Genomic rearrangements | DNA | Genomics | Structural variations | Computer Science Applications | Identification (information) | 030104 developmental biology | lcsh:Biology (General) | Nucleic Acid Conformation | Graph (abstract data type) | lcsh:R858-859.7 | Data mining | computer | Algorithms | BMC Bioinformatics
researchProduct

Automated selection of homologs to track the evolutionary history of proteins

2018

Background The selection of distant homologs of a query protein under study is a usual and useful application of protein sequence databases. Such sets of homologs are often applied to investigate the function of a protein and the degree to which experimental results can be transferred from one organism to another. In particular, a variety of databases facilitates static browsing for orthologs. However, these resources have limited power when identifying orthologs between taxonomically distant species. In addition, in some situations, for a given query protein, it is advantageous to compare the sets of orthologs from different specific organisms: this recursive step-wise search might give …

0301 basic medicine | Proteome | Computer science | Computational biology | Web tool | lcsh:Computer applications to medicine. Medical informatics | Biochemistry | Homology (biology) | Evolution Molecular | 03 medical and health sciences | 0302 clinical medicine | Protein sequencing | Structural Biology | Homologous chromosome | Humans | Databases Protein | Molecular Biology | lcsh:QH301-705.5 | Organism | Protein function | Methodology Article | Applied Mathematics | Proteins | A protein | Computer Science Applications | Homology | Evolutionary path | 030104 developmental biology | ComputingMethodologies_PATTERNRECOGNITION | lcsh:Biology (General) | Proteome | lcsh:R858-859.7 | DNA microarray | Software | 030217 neurology & neurosurgery | BMC Bioinformatics
researchProduct

A hybrid short read mapping accelerator

2013

Background The rapid growth of short read datasets poses a new challenge to the short read mapping problem in terms of sensitivity and execution speed. Existing methods often use a restrictive error model for computing the alignments to improve speed, whereas more flexible error models are generally too slow for large-scale applications. A number of short read mapping software tools have been proposed. However, designs based on hardware are relatively rare. Field programmable gate arrays (FPGAs) have been successfully used in a number of specific application areas, such as the DSP and communications domains due to their outstanding parallel data processing capabilities, making them a compet…

:Engineering::Computer science and engineering [DRNTU] | Genome | Computer science | business.industry | Applied Mathematics | Methodology Article | Chromosome Mapping | Sequence Analysis DNA | Biochemistry | Computer Science Applications | Software | Computer engineering | Structural Biology | Sensitivity (control systems) | DNA microarray | business | Field-programmable gate array | Algorithm | Molecular Biology | Sequence Alignment | Digital signal processing | Algorithms | Software | Reference genome | BMC Bioinformatics
researchProduct

FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy

2021

Abstract Background Storage of genomic data is a major cost for the Life Sciences, effectively addressed via specialized data compression methods. For the same reasons of abundance in data production, the use of Big Data technologies is seen as the future for genomic data storage and processing, with MapReduce-Hadoop as leaders. Somewhat surprisingly, none of the specialized FASTA/Q compressors is available within Hadoop. Indeed, their deployment there is not exactly immediate. Such a State of the Art is problematic. Results We provide major advances in two different directions. Methodologically, we propose two general methods, with the corresponding software, that make it very easy to deploy …

Big Data | FASTQ format | Computer science | Big data | 02 engineering and technology | computer.software_genre | lcsh:Computer applications to medicine. Medical informatics | Biochemistry | 03 medical and health sciences | Software | Structural Biology | Spark (mathematics) | 0202 electrical engineering electronic engineering information engineering | Data_FILES | MapReduce | MapReduce; hadoop; sequence analysis; data compression | Molecular Biology | lcsh:QH301-705.5 | 030304 developmental biology | File system | 0303 health sciences | Settore INF/01 - Informatica | Database | business.industry | Methodology Article | Applied Mathematics | Sequence analysis | Genomics | Data compression; Hadoop; MapReduce; Sequence analysis; Algorithms; Big Data; Data Compression; Genomics; Software | Computer Science Applications | lcsh:Biology (General) | Software deployment | Hadoop | Data compression | lcsh:R858-859.7 | 020201 artificial intelligence & image processing | State (computer science) | business | computer | Algorithms | Software | Data compression | BMC Bioinformatics
researchProduct

A motif-independent metric for DNA sequence specificity

2011

Abstract Background Genome-wide mapping of protein-DNA interactions has been widely used to investigate biological functions of the genome. An important question is to what extent such interactions are regulated at the DNA sequence level. However, current investigation is hampered by the lack of computational methods for systematically evaluating sequence specificity. Results We present a simple, unbiased quantitative measure for DNA sequence specificity called the Motif Independent Measure (MIM). By analyzing both simulated and real experimental data, we found that the MIM measure can be used to detect sequence specificity independent of the presence of transcription factor (TF) binding motifs. We…

Biology | lcsh:Computer applications to medicine. Medical informatics | DNA-binding protein | Genome | Biochemistry | DNA sequencing | Cell Line | 03 medical and health sciences | chemistry.chemical_compound | 0302 clinical medicine | Structural Biology | Humans | Transcription factor | Molecular Biology | lcsh:QH301-705.5 | Sequence Specificity Epigenomics Bioinformatics | 030304 developmental biology | Epigenomics | Genetics | 0303 health sciences | Base Sequence | Settore INF/01 - Informatica | Genome Human | Applied Mathematics | Methodology Article | DNA | Computer Science Applications | DNA-Binding Proteins | chemistry | lcsh:Biology (General) | lcsh:R858-859.7 | Human genome | DNA microarray | 030217 neurology & neurosurgery | DNA | Algorithms | Software | Genome-Wide Association Study | Protein Binding | Transcription Factors | BMC Bioinformatics
researchProduct

PVAmpliconFinder: a workflow for the identification of human papillomaviruses from high-throughput amplicon sequencing

2019

Abstract Background The detection of known human papillomaviruses (PVs) from targeted wet-lab approaches has traditionally used PCR-based methods coupled with Sanger sequencing. With the introduction of next-generation sequencing (NGS), these approaches can be revisited to integrate the sequencing power of NGS. Although computational tools have been developed for metagenomic approaches to search for known or novel viruses in NGS data, no appropriate tool is available for the classification and identification of novel viral sequences from data produced by amplicon-based methods. Results We have developed PVAmpliconFinder, a data analysis workflow designed to rapidly identify and classify kno…

Computer science | Computational biology | lcsh:Computer applications to medicine. Medical informatics | Biochemistry | Workflow | User-Computer Interface | 03 medical and health sciences | symbols.namesake | Structural Biology | Humans | Virus discovery | lcsh:QH301-705.5 | Papillomaviridae | Molecular Biology | Throughput (business) | Phylogeny | Amplicon sequencing | 030304 developmental biology | Sanger sequencing | 0303 health sciences | Biological data | 030306 microbiology | Methodology Article | Applied Mathematics | High-Throughput Nucleotide Sequencing | Papillomavirus | Amplicon | Computer Science Applications | Identification (information) | Workflow | lcsh:Biology (General) | Metagenomics | DNA Viral | Amplicon sequencing | symbols | lcsh:R858-859.7 | Primer (molecular biology) | DNA microarray | BMC Bioinformatics
researchProduct

Topology testing of phylogenies using least squares methods

2006

[Background] The least squares (LS) method for constructing confidence sets of trees is closely related to LS tree building methods, in which the goodness of fit of the distances measured on the tree (patristic distances) to the observed distances between taxa is the criterion used for selecting the best topology. The generalized LS (GLS) method for topology testing is often frustrated by the computational difficulties in calculating the covariance matrix and its inverse, which in practice requires approximations. The weighted LS (WLS) allows for a more efficient albeit approximate calculation of the test statistic by ignoring the covariances between the distances.

Evolution | Inverse | Hepacivirus | Biology | Topology | DNA Mitochondrial | Least squares methods | Least squares | Evolution Molecular | Goodness of fit | :CIENCIAS DE LA VIDA::Genética ::Ingeniería genética [UNESCO] | Test statistic | QH359-425 | Animals | Humans | Least-Squares Analysis | Phylogeny | Ecology Evolution Behavior and Systematics | Statistic | Phylogenetic tree | Covariance matrix | UNESCO::CIENCIAS DE LA VIDA::Genética ::Ingeniería genética | Methodology Article | Phylogenies; Least squares methods | Classification | Hepatitis C | Tree (graph theory) | Sea Urchins | Phylogenies
researchProduct

MGFM: a novel tool for detection of tissue and cell specific marker genes from microarray gene expression data

2015

Background Identification of marker genes associated with a specific tissue/cell type is a fundamental challenge in genetic and cell research. Marker genes are of great importance for determining cell identity, and for understanding tissue specific gene function and the molecular mechanisms underlying complex diseases. Results We have developed a new bioinformatics tool called MGFM (Marker Gene Finder in Microarray data) to predict marker genes from microarray gene expression data. Marker genes are identified through the grouping of samples of the same type with similar marker gene expression levels. We verified our approach using two microarray data sets from the NCBI’s Gene Expression Omn…

Genetic Markers | Cancer Research | Microarrays | Biology | Marker genes | Web Browser | Proteomics | Marker gene | Bioconductor | Genetics | Gene | Genetic Association Studies | Genetics | Microarray analysis techniques | Methodology Article | Gene Expression Profiling | Computational Biology | Reproducibility of Results | 3. Good health | Gene expression profiling | Samples | Gene Ontology | Genetic marker | Organ Specificity | DNA microarray | Biotechnology | BMC Genomics
researchProduct

Selection of suitable housekeeping genes for expression analysis in glioblastoma using quantitative RT-PCR

2009

Abstract Background Considering the broad variation in the expression of housekeeping genes among tissues and experimental situations, studies using quantitative RT-PCR require strict definition of adequate endogenous controls. For glioblastoma, the most common type of tumor in the central nervous system, there was no previous report regarding this issue. Results Here we show that amongst seven frequently used housekeeping genes TBP and HPRT1 are adequate references for glioblastoma gene expression analysis. Evaluation of the expression levels of 12 target genes utilizing different endogenous controls revealed that the normalization method applied might introduce errors in the estimation of…

Hypoxanthine Phosphoribosyltransferase | Cell type | Lung Neoplasms | lcsh:QH426-470 | Journal Club | Cell | Gene Expression | Computational biology | Biology | Bioinformatics | Models Biological | Variable Expression | Reference genes | Expression analysis | Gene expression | medicine | Humans | Student’s Section | lcsh:QH573-671 | Molecular Biology | Gene | Selection (genetic algorithm) | Genetics | Regulation of gene expression | Genes Essential | lcsh:Cytology | Brain Neoplasms | Reverse Transcriptase Polymerase Chain Reaction | Methodology Article | General Neuroscience | Reference Standards | TATA-Box Binding Protein | medicine.disease | Housekeeping gene | DNA-Binding Proteins | Gene Expression Regulation Neoplastic | lcsh:Genetics | NEOPLASIAS DO SISTEMA NERVOSO | Real-time polymerase chain reaction | medicine.anatomical_structure | Glioblastoma | Glioblastoma | Annals of Neurosciences
researchProduct